energy dissipation
SpikeX: Exploring Accelerator Architecture and Network-Hardware Co-Optimization for Sparse Spiking Neural Networks
Xu, Boxun, Boone, Richard, Li, Peng
Spiking Neural Networks (SNNs) are promising biologically plausible models of computation which utilize a spiking binary activation function similar to that of biological neurons. SNNs are well positioned to process spatiotemporal data, and are advantageous for ultra-low-power and real-time processing. Despite a large body of work on conventional artificial neural network accelerators, much less attention has been given to efficient SNN hardware accelerator design. In particular, SNNs exhibit inherent unstructured spatial and temporal firing sparsity, an opportunity yet to be fully exploited for hardware processing efficiency. In this work, we propose a novel systolic-array SNN accelerator architecture, called SpikeX, to take on the challenges and opportunities stemming from unstructured sparsity while taking into account the unique characteristics of spike-based computation. By developing an efficient dataflow targeting expensive multi-bit weight data movements, SpikeX reduces memory access and increases data sharing and hardware utilization for computations spanning both time and space, thereby significantly improving energy efficiency and inference latency. Furthermore, recognizing the importance of SNN network and hardware co-design, we develop a co-optimization methodology facilitating not only hardware-aware SNN training but also hardware accelerator architecture search, allowing joint network weight parameter optimization and accelerator architectural reconfiguration. This end-to-end network/accelerator co-design approach offers a significant reduction of 15.1x-150.87x in energy-delay product (EDP) without compromising model accuracy.
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- Asia (0.04)
- Semiconductors & Electronics (0.68)
- Health & Medicine (0.46)
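The SpikeX abstract's central idea, exploiting binary spikes so that each multi-bit weight row is fetched once and reused across every timestep in which that input fires, can be illustrated with a minimal sketch. The function name and layout below are my own assumptions, not the paper's implementation:

```python
import numpy as np

def sparse_spike_matmul(spikes, weights):
    """Accumulate synaptic currents only for inputs that actually fired.

    spikes:  (T, N_in) binary spike trains over T timesteps
    weights: (N_in, N_out) multi-bit synaptic weights
    Returns (T, N_out) input currents.

    Because spikes are binary, an active input contributes its weight row
    by pure addition; silent inputs are skipped entirely (unstructured
    sparsity), and a weight row fetched once is shared across all
    timesteps where that neuron fired (temporal data sharing).
    """
    T, n_in = spikes.shape
    out = np.zeros((T, weights.shape[1]))
    for j in range(n_in):
        t_active = np.nonzero(spikes[:, j])[0]   # timesteps where input j fired
        if t_active.size == 0:
            continue                             # skip silent input entirely
        out[t_active] += weights[j]              # one weight fetch, reused over time
    return out
```

The result matches a dense `spikes @ weights` product, but the work and weight traffic scale with the number of active spikes rather than with the full tensor size, which is the dataflow opportunity the paper targets.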
Design of a Flexible Robot Arm for Safe Aerial Physical Interaction
Mellet, Julien, Berra, Andrea, Seisa, Achilleas Santi, Sankaranarayanan, Viswa, Gamage, Udayanga G. W. K. N., Soto, Miguel Angel Trujillo, Heredia, Guillermo, Nikolakopoulos, George, Lippiello, Vincenzo, Ruggiero, Fabio
This paper introduces a novel compliant mechanism combining light weight and energy dissipation for aerial physical interaction. Weighing 400 g at take-off, the mechanism is actuated in the forward body direction, enabling precise position control for force interaction and various other aerial manipulation tasks. The robotic arm, structured as a closed-loop kinematic chain, employs two remotely mounted servomotors. Each joint is actuated by a single tendon, providing active motion control when the arm is in compression at the end-effector. Its elasto-mechanical design reduces weight and provides flexibility, allowing passive-compliant interactions without compromising the motors' integrity. Notably, the arm's damping can be adjusted via the proposed inner frictional bulges. Experimental applications showcase the aerial system's performance in both free flight and physical interaction. The presented work may open up safer applications for micro aerial vehicles (MAVs) in real environments subject to perturbations during interaction.
- North America > Costa Rica > Heredia Province > Heredia (0.05)
- Europe > Spain > Andalusia > Seville Province > Seville (0.04)
- Europe > Denmark (0.04)
- (2 more...)
- Energy (0.88)
- Aerospace & Defense (0.66)
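The passive-compliant behavior described above (elastic link plus adjustable frictional damping) can be sketched as a single-joint spring-damper model with a Coulomb friction term standing in for the frictional bulges. All parameter values and names here are illustrative assumptions, not the paper's identified model:

```python
import numpy as np

def compliant_joint_step(theta, omega, tau_ext, dt,
                         k=2.0, c=0.15, tau_fric=0.05, inertia=0.01):
    """One semi-implicit Euler step of a passive-compliant joint.

    k is the elastic stiffness of the flexible link, c a viscous term,
    and tau_fric a Coulomb friction torque modeling the adjustable
    frictional-bulge damping; tau_ext is the interaction torque.
    Returns the updated (angle, angular velocity).
    """
    tau = tau_ext - k * theta - c * omega - tau_fric * np.sign(omega)
    omega_new = omega + dt * tau / inertia
    theta_new = theta + dt * omega_new       # semi-implicit: use new velocity
    return theta_new, omega_new
```

Simulating a deflected joint with zero external torque shows the oscillation decaying toward rest, with the residual offset bounded by the stick threshold `tau_fric / k`, which is how raising the frictional damping trades tracking precision for impact absorption.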
PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking
PRefLexOR (Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning) combines preference optimization with concepts from Reinforcement Learning to enable models to self-teach through iterative reasoning improvements. We propose a recursive learning approach that engages the model in multi-step reasoning, revisiting, and refining intermediate steps before producing a final output in training and inference phases. Through multiple training stages, the model first learns to align its reasoning with accurate decision paths by optimizing the log odds between preferred and non-preferred responses. During this process, PRefLexOR builds a dynamic knowledge graph by generating questions from random text chunks and retrieval-augmentation to contextualize relevant details from the entire training corpus. In the second stage, preference optimization enhances model performance by using rejection sampling to fine-tune reasoning quality by continually producing in-situ training data while masking the reasoning steps. Recursive optimization within a thinking token framework introduces iterative feedback loops, where the model refines reasoning, achieving deeper coherence, consistency, and adaptability. Implemented in small language models with only 3 billion parameters, we show that even tiny models can iteratively teach themselves to reason with greater depth and reflectivity. Our implementation is straightforward and can be incorporated into any existing pretrained LLM. We focus our examples on applications in biological materials science and demonstrate the method in a variety of case studies that range from in-domain to cross-domain applications. Using reasoning strategies that include thinking and reflection modalities, we build a multi-agent recursive self-improving inference approach to successively improve responses via repeated sampling at inference time.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- Europe > Netherlands (0.04)
- Workflow (1.00)
- Overview (1.00)
- Research Report > Promising Solution (0.92)
- Research Report > New Finding (0.67)
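The abstract's first-stage objective, "optimizing the log odds between preferred and non-preferred responses", can be sketched as an odds-ratio preference term. This is a generic illustration of that idea, not the paper's exact loss; the function name and the use of mean token log-probabilities are my assumptions:

```python
import math

def log_odds_preference_loss(logp_chosen, logp_rejected):
    """Odds-ratio preference term: push up the log odds of the preferred
    response relative to the rejected one.

    logp_* are mean token log-probabilities of each full response
    (values in (-inf, 0)). With odds(p) = p / (1 - p), the loss is
    -log sigmoid(log odds_chosen - log odds_rejected), which is small
    when the chosen response is already much more likely.
    """
    def log_odds(logp):
        p = math.exp(logp)
        return math.log(p / (1.0 - p))

    z = log_odds(logp_chosen) - log_odds(logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-z)))  # -log sigmoid(z)
```

For instance, `log_odds_preference_loss(-0.5, -2.0)` (chosen more likely) yields a much smaller penalty than the reversed ordering, which is the gradient signal that aligns reasoning with preferred decision paths.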
Adaptive Resolution Inference (ARI): Energy-Efficient Machine Learning for Internet of Things
Wang, Ziheng, Reviriego, Pedro, Niknia, Farzad, Conde, Javier, Liu, Shanshan, Lombardi, Fabrizio
The implementation of machine learning in Internet of Things devices poses significant operational challenges due to limited energy and computation resources. In recent years, significant efforts have been made to implement simplified ML models that can achieve reasonable performance while reducing computation and energy, for example by pruning weights in neural networks, or using reduced precision for the parameters and arithmetic operations. However, this type of approach is limited by the performance of the ML implementation, i.e., by the loss in, for example, accuracy due to the model simplification. In this article, we present adaptive resolution inference (ARI), a novel approach that enables the evaluation of new tradeoffs between energy dissipation and model performance in ML implementations. The main principle of the proposed approach is to run inferences with reduced precision (quantization) and use the margin over the decision threshold to determine whether the result is reliable or the inference must be rerun with the full model. The rationale is that quantization only introduces small deviations in the inference scores, such that if the scores have a sufficient margin over the decision threshold, it is unlikely that the full model would produce a different result. Therefore, we can run the quantized model first, and only when the scores lack a sufficient margin is the full model run. This allows most inferences to run with the reduced-precision model, with only a small fraction requiring the full model, thereby significantly reducing computation and energy without affecting model performance. The proposed ARI approach is presented, analyzed in detail, and evaluated using different data sets for floating-point and stochastic computing implementations. The results show that ARI can significantly reduce the energy for inference in different configurations, with savings between 40% and 85%.
- Europe > Spain > Galicia > Madrid (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Asia > China > Sichuan Province > Chengdu (0.04)
- Research Report > Promising Solution (0.34)
- Research Report > New Finding (0.34)
- Energy (1.00)
- Information Technology > Smart Houses & Appliances (0.72)
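The ARI decision rule described above, accept the quantized score when it clears the decision threshold by a margin and fall back to the full model otherwise, fits in a few lines. The names and the margin value are illustrative, not the paper's calibrated settings:

```python
def adaptive_resolution_infer(x, quant_model, full_model,
                              threshold=0.5, margin=0.1):
    """Run the reduced-precision model first; invoke the full-precision
    model only when the quantized score is too close to the decision
    threshold to be trusted.

    Returns (score, used_full_model) so callers can track how often the
    expensive fallback path is taken.
    """
    score = quant_model(x)
    if abs(score - threshold) >= margin:
        return score, False          # confident: accept quantized result
    return full_model(x), True       # ambiguous: rerun at full precision
```

The energy saving comes from `used_full_model` being `True` for only a small fraction of inputs; the margin is the knob trading fallback frequency against the risk of the quantized and full models disagreeing.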
Energy-efficiency Limits on Training AI Systems using Learning-in-Memory
Chen, Zihao, Leugering, Johannes, Cauwenberghs, Gert, Chakrabartty, Shantanu
Learning-in-memory (LIM) is a recently proposed paradigm to overcome fundamental memory bottlenecks in training machine learning systems. While compute-in-memory (CIM) approaches can address the so-called memory-wall (i.e. energy dissipated due to repeated memory read access), they are agnostic to the energy dissipated due to repeated memory writes at the precision required for training (the update-wall), and they do not account for the energy dissipated when transferring information between short-term and long-term memories (the consolidation-wall). The LIM paradigm proposes that these bottlenecks, too, can be overcome if the energy barrier of physical memories is adaptively modulated such that the dynamics of memory updates and consolidation match the Lyapunov dynamics of gradient-descent training of an AI model. In this paper, we derive new theoretical lower bounds on energy dissipation when training AI systems using different LIM approaches. The analysis presented here is model-agnostic and highlights the trade-off between energy efficiency and the speed of training. The resulting non-equilibrium energy-efficiency bounds have a similar flavor to Landauer's energy-dissipation bounds. We also extend these limits by taking into account the number of floating-point operations (FLOPs) used for training, the size of the AI model, and the precision of the training parameters. Our projections suggest that the energy-dissipation lower bound to train a brain-scale AI system (comprising $10^{15}$ parameters) using LIM is $10^8 \sim 10^9$ Joules, which is of the same order of magnitude as Landauer's adiabatic lower bound and $6$ to $7$ orders of magnitude lower than projections based on state-of-the-art AI accelerator hardware.
- North America > United States > Missouri > St. Louis County > St. Louis (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- (2 more...)
- Information Technology (0.67)
- Health & Medicine > Therapeutic Area > Neurology (0.46)
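The Landauer baseline that the paper's LIM bounds are compared against is easy to state numerically: erasing one bit dissipates at least $k_B T \ln 2$ joules. The helper below is a generic back-of-envelope calculator, and any particular erasure count plugged into it is an assumption, not the paper's accounting of FLOPs, model size, and precision:

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def landauer_bound_joules(bit_erasures, temperature_k=300.0):
    """Landauer's adiabatic lower bound on dissipation: at least
    k_B * T * ln(2) joules per irreversible bit erasure, at absolute
    temperature T. At room temperature this is about 2.9e-21 J/bit.
    """
    return bit_erasures * K_B * temperature_k * math.log(2.0)
```

For example, an assumed budget of $10^{32}$ bit erasures at 300 K gives a bound of a few times $10^{11}$ J; the paper's tighter, training-specific bounds additionally fold in model size, FLOPs, parameter precision, and training speed.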
Turbulence Scaling from Deep Learning Diffusion Generative Models
Whittaker, Tim, Janik, Romuald A., Oz, Yaron
Complex spatial and temporal structures are inherent characteristics of turbulent fluid flows, and comprehending them poses a major challenge. This comprehension necessitates an understanding of the space of turbulent fluid flow configurations. We employ a diffusion-based generative model to learn the distribution of turbulent vorticity profiles and generate snapshots of turbulent solutions to the incompressible Navier-Stokes equations. We consider the inverse cascade in two spatial dimensions and generate diverse turbulent solutions that differ from those in the training dataset. We analyze the statistical scaling properties of the new turbulent profiles, calculating their structure functions, energy power spectrum, velocity probability distribution function, and moments of local energy dissipation. All the learnt scaling exponents are consistent with the expected Kolmogorov scaling and have lower errors than the training ones. This agreement with established turbulence characteristics provides strong evidence of the model's capability to capture essential features of real-world turbulence.
- North America > Canada > Quebec (0.14)
- Asia > Middle East > Israel (0.14)
- Europe > Poland (0.14)
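The structure functions used above to validate the generated snapshots are moments of velocity increments, $S_p(r) = \langle |v(x+r) - v(x)|^p \rangle$, whose scaling exponents are compared against Kolmogorov's prediction $\zeta_p = p/3$. A minimal 1-D sketch (the paper works with 2-D vorticity fields; this simplification is mine):

```python
import numpy as np

def structure_function(v, p, separations):
    """p-th order structure function S_p(r) = <|v(x+r) - v(x)|^p> of a
    1-D velocity sample, evaluated at the given integer separations r.
    The scaling exponent zeta_p is then the slope of log S_p vs log r
    in the inertial range."""
    return np.array([np.mean(np.abs(v[r:] - v[:-r]) ** p)
                     for r in separations])
```

On a pure ramp `v[x] = x` the increments equal `r`, so `S_p(r) = r**p` exactly, a convenient sanity check before applying the same fit to generated turbulent profiles.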
How do Hamiltonian Neural Networks Operate? (Machine Learning)
Hybrid machine learning based on Hamiltonian formulations has recently been demonstrated successfully for simple mechanical systems. In this work, we stress-test the method on both simple mass-spring systems and more complex, realistic systems with several internal and external forces, including a system with multiple connected tanks. We quantify performance under various conditions and show that imposing different assumptions greatly affects performance during training, presenting both advantages and limitations of the method. We demonstrate that port-Hamiltonian neural networks can be extended to larger dimensions with state-dependent ports. We consider learning on systems with known and unknown external forces and show how the approach can be used to detect deviations in a system while still providing a valid model when the deviations are removed.
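The port-Hamiltonian structure these hybrid networks impose is the standard form $\dot{x} = (J - R)\nabla H(x) + G u$, where $J$ is skew-symmetric (conservative coupling), $R$ is positive semi-definite (dissipation), and $G$ routes external ports $u$ into the state. A minimal sketch of that right-hand side, with the gradient of a known Hamiltonian standing in for the autodiff gradient of a learned neural $H$:

```python
import numpy as np

def port_hamiltonian_rhs(x, grad_H, J, R, G, u):
    """Right-hand side of a port-Hamiltonian system,
        x_dot = (J - R) @ grad_H(x) + G @ u.
    In a port-Hamiltonian neural network, grad_H would be obtained by
    automatic differentiation of a learned Hamiltonian network; here
    grad_H is supplied directly for illustration.
    """
    return (J - R) @ grad_H(x) + G @ u
```

For a unit mass-spring system with $H = (q^2 + p^2)/2$ and no dissipation or input, the state $x = (1, 0)$ yields $\dot{x} = (0, -1)$: pure rotation in phase space, i.e. energy-conserving dynamics by construction.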
MIT Uses AI To Discover Hidden Magnetic Properties in Multi-Layered Electronic Material
MIT researchers discovered hidden magnetic properties in a multi-layered electronic material by analyzing polarized neutron data with neural networks. The team incorporated AI to facilitate the detection of an intriguing materials phenomenon that could lead to electronics without energy dissipation. Superconductors have long been considered the principal approach to realizing electronics without resistivity. In the past decade, a new family of quantum materials, "topological materials," has offered an alternative and promising means of achieving electronics without energy dissipation (or loss). Compared to superconductors, topological materials provide a few advantages, such as robustness against disturbances.
- North America > United States > Pennsylvania (0.05)
- North America > United States > Minnesota (0.05)
- North America > United States > California > Los Angeles County > Los Angeles (0.05)
- Energy (0.72)
- Government > Regional Government > North America Government > United States Government (0.31)
Dissipative SymODEN: Encoding Hamiltonian Dynamics with Dissipation and Control into Deep Learning
Zhong, Yaofeng Desmond, Dey, Biswadip, Chakraborty, Amit
In this work, we introduce Dissipative SymODEN, a deep learning architecture which can infer the dynamics of a physical system with dissipation from observed state trajectories. To improve prediction accuracy while reducing network size, Dissipative SymODEN encodes the port-Hamiltonian dynamics with energy dissipation and external input into the design of its computation graph and learns the dynamics in a structured way. The learned model, by revealing key aspects of the system, such as the inertia, dissipation, and potential energy, paves the way for energy-based controllers.
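The structural bias Dissipative SymODEN encodes, port-Hamiltonian dynamics with an explicit dissipation term, guarantees that without external input the learned energy can only decrease. A small simulation of a damped oscillator in that form illustrates the property; the parameter values are illustrative, not learned:

```python
def damped_hamiltonian_trajectory(q0, p0, k=1.0, m=1.0, d=0.2,
                                  dt=0.01, steps=1000):
    """Simulate the dissipative port-Hamiltonian structure
        q_dot = p / m,
        p_dot = -k q - d p / m,
    (dissipation coefficient d acting on dH/dp) with semi-implicit
    Euler, and return the energy H = p^2/(2m) + k q^2/2 along the
    trajectory. With d > 0 and no input, H must decay toward zero.
    """
    q, p = q0, p0
    energies = []
    for _ in range(steps):
        p = p + dt * (-k * q - d * p / m)
        q = q + dt * p / m              # semi-implicit: use updated p
        energies.append(p * p / (2 * m) + k * q * q / 2)
    return energies
```

Because the architecture hard-wires this structure into the computation graph, the inertia, dissipation, and potential-energy terms it recovers are directly interpretable, which is what makes the learned model usable by energy-based controllers.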
Self-similarity Properties of Natural Images
Turiel, Antonio, Mato, Germán, Parga, Néstor, Nadal, Jean-Pierre
Scale invariance is a fundamental property of ensembles of natural images [1]. Their non-Gaussian properties [15, 16] are less well understood, but they indicate the existence of a rich statistical structure. In this work we present a detailed study of the marginal statistics of a variable related to the edges in the images. A numerical analysis shows that it exhibits extended self-similarity [3, 4, 5]. This is a scaling property stronger than self-similarity: all its moments can be expressed as a power of any given moment. More interestingly, all the exponents can be predicted in terms of a multiplicative log-Poisson process. This is the very same model recently used to predict the correct exponents of the structure functions of turbulent flows [6]. These results allow us to study the underlying multifractal singularities. In particular, we find that the most singular structures are one-dimensional: the most singular manifold consists of sharp edges.
- Europe > Spain > Galicia > Madrid (0.05)
- Europe > France (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Brunei (0.04)
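Extended self-similarity (ESS), as used above, means each moment is a pure power of a chosen reference moment, so relative exponents can be read off from a log-log regression of one moment against another rather than against the separation itself. A minimal sketch on a 1-D signal (the paper works with edge-related image variables; this stand-in is mine):

```python
import numpy as np

def ess_exponent(v, p, ref_p=3, separations=(1, 2, 4, 8, 16)):
    """Extended self-similarity fit: regress log S_p(r) against
    log S_ref(r) over the given separations. Under ESS the relation is
    a pure power law, so the slope is the relative exponent
    zeta_p / zeta_ref, which stays well-defined even when neither
    moment scales cleanly with r alone."""
    sp = [np.mean(np.abs(v[r:] - v[:-r]) ** p) for r in separations]
    sref = [np.mean(np.abs(v[r:] - v[:-r]) ** ref_p) for r in separations]
    slope, _ = np.polyfit(np.log(sref), np.log(sp), 1)
    return slope
```

On a linear ramp, where $S_p(r) = r^p$ exactly, the fit returns $p/3$ against the third-order reference, matching the trivial self-similar case before the method is applied to data with log-Poisson corrections.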